00:00
2026-05-14
maltebuettner.eu
large-language-models
documentai bbox benchmark
Malte Buettner benchmarked bounding box accuracy for Document AI models using pages from the FlashAttention-3 paper, testing Qwen, Kimi, and Mistral via OpenRouter. The evaluation scored models on cov…